RNA-seq analysis for Jagesh Shah group at Longwood.
Contact Lorena Pantano (lpantano@hsph.harvard.edu) for additional details.
The most recent update of this html document occurred: Thu Feb 2 17:08:53 2017
The sections below provide code to reproduce the included results and plots.
## Warning: package 'knitr' was built under R version 3.3.2
2017-02-02 17:09:01 INFO::Using gene counts calculated from the Sailfish transcript counts.
remove unwanted variation with ruvseq
out of 29331 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 29, 0.099%
LFC < 0 (down) : 7, 0.024%
outliers [1] : 0, 0%
low counts [2] : 6750, 23%
(mean count < 1)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results
NULL
Differential expression file at: mice_model_jck_wt_de.csv
Normalized counts matrix file at: mice_model_jck_wt_log2_counts.csv
Plot top 9 genes
| baseMean | log2FoldChange | lfcSE | stat | pvalue | padj | absMaxLog2FC | |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000051397 | 1242.75431 | 0.6229989 | 0.0838691 | 7.428226 | 0.00e+00 | 0.0000000 | 0.6229989 |
| ENSMUSG00000039043 | 562.33701 | 0.2928119 | 0.0475347 | 6.159960 | 0.00e+00 | 0.0000082 | 0.2928119 |
| ENSMUSG00000021678 | 718.54090 | 0.3667241 | 0.0602229 | 6.089448 | 0.00e+00 | 0.0000085 | 0.3667241 |
| ENSMUSG00000036853 | 903.87767 | 0.6877411 | 0.1160272 | 5.927411 | 0.00e+00 | 0.0000174 | 0.6877411 |
| ENSMUSG00000024479 | 2146.54521 | 0.4234369 | 0.0786794 | 5.381802 | 1.00e-07 | 0.0003337 | 0.4234369 |
| ENSMUSG00000053977 | 33.15227 | -0.6117497 | 0.1195484 | -5.117172 | 3.00e-07 | 0.0011696 | 0.6117497 |
| ENSMUSG00000069833 | 4681.72848 | 0.3295937 | 0.0727160 | 4.532616 | 5.80e-06 | 0.0188305 | 0.3295937 |
| ENSMUSG00000024098 | 3855.34422 | 0.1982697 | 0.0445446 | 4.451041 | 8.50e-06 | 0.0241688 | 0.1982697 |
| ENSMUSG00000026051 | 154.27817 | 0.5108982 | 0.1156679 | 4.416942 | 1.00e-05 | 0.0251669 | 0.5108982 |
| ENSMUSG00000005131 | 83.16467 | 0.4995291 | 0.1177629 | 4.241820 | 2.22e-05 | 0.0334434 | 0.4995291 |
| ENSMUSG00000032291 | 36.05124 | -0.4976191 | 0.1171541 | -4.247560 | 2.16e-05 | 0.0334434 | 0.4976191 |
| ENSMUSG00000034456 | 96.03794 | -0.4372152 | 0.1019052 | -4.290412 | 1.78e-05 | 0.0334434 | 0.4372152 |
| ENSMUSG00000042348 | 1041.36966 | 0.2703403 | 0.0633873 | 4.264900 | 2.00e-05 | 0.0334434 | 0.2703403 |
| ENSMUSG00000045094 | 59.54043 | 0.4631928 | 0.1078332 | 4.295456 | 1.74e-05 | 0.0334434 | 0.4631928 |
| ENSMUSG00000086807 | 26.88535 | 0.5063060 | 0.1192938 | 4.244193 | 2.19e-05 | 0.0334434 | 0.5063060 |
| ENSMUSG00000026638 | 757.34531 | 0.2013363 | 0.0483002 | 4.168434 | 3.07e-05 | 0.0433711 | 0.2013363 |
| ENSMUSG00000063130 | 82.04451 | 0.4753051 | 0.1152858 | 4.122841 | 3.74e-05 | 0.0498076 | 0.4753051 |
| ENSMUSG00000048834 | 76.93171 | -0.4840475 | 0.1190624 | -4.065494 | 4.79e-05 | 0.0602491 | 0.4840475 |
| ENSMUSG00000098374 | 386.91352 | 0.2602137 | 0.0642747 | 4.048462 | 5.16e-05 | 0.0613942 | 0.2602137 |
| ENSMUSG00000041782 | 989.93007 | -0.4816925 | 0.1196945 | -4.024348 | 5.71e-05 | 0.0643193 | 0.4816925 |
Here, I only considered the condition to get the DE genes. These will contain genes where the mean on each condition is different, and will help to detect genes that are always UP or DOWN.
out of 29376 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 12, 0.041%
LFC < 0 (down) : 21, 0.071%
outliers [1] : 93, 0.32%
low counts [2] : 7276, 25%
(mean count < 1)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results
NULL
Differential expression file at: mice_model_de.csv
Normalized counts matrix file at: mice_model_log2_counts.csv
Plot top 9 genes
| baseMean | log2FoldChange | lfcSE | stat | pvalue | padj | symbol | description | absMaxLog2FC | |
|---|---|---|---|---|---|---|---|---|---|
| ENSMUSG00000038071 | 471.90441 | -1.6987912 | 0.3718355 | 42.36484 | 0.00e+00 | 0.0000739 | Npy6r | neuropeptide Y receptor Y6 | 1.6987912 |
| ENSMUSG00000036853 | 903.87767 | -1.5830983 | 0.2869988 | 37.82342 | 0.00e+00 | 0.0003157 | Mcoln3 | mucolipin 3 | 1.5830983 |
| ENSMUSG00000069833 | 4681.72848 | -0.6672764 | 0.1298276 | 37.13760 | 0.00e+00 | 0.0003157 | Ahnak | AHNAK nucleoprotein (desmoyokin) | 0.6672764 |
| ENSMUSG00000075588 | 144.23482 | 0.7921397 | 0.2347404 | 36.35577 | 1.00e-07 | 0.0003465 | Hoxb2 | homeobox B2 | 0.7921397 |
| ENSMUSG00000031789 | 35.35275 | -3.7971354 | 0.7057465 | 35.18578 | 1.00e-07 | 0.0004899 | Cngb1 | cyclic nucleotide gated channel beta 1 | 3.7971354 |
| ENSMUSG00000036181 | 619.18393 | 0.8003920 | 0.2328268 | 34.74142 | 1.00e-07 | 0.0005067 | Hist1h1c | histone cluster 1, H1c | 0.8003920 |
| ENSMUSG00000015957 | 199.62821 | 1.3588321 | 0.2812957 | 31.85778 | 6.00e-07 | 0.0016989 | Wnt11 | wingless-type MMTV integration site family, member 11 | 1.3588321 |
| ENSMUSG00000083670 | 31.92052 | 8.0509876 | 1.5994637 | 31.65871 | 6.00e-07 | 0.0016989 | NA | NA | 8.0509876 |
| ENSMUSG00000029651 | 53.47416 | -2.4367913 | 0.5825888 | 30.60947 | 1.00e-06 | 0.0025117 | Mtus2 | microtubule associated tumor suppressor candidate 2 | 2.4367913 |
| ENSMUSG00000026051 | 154.27817 | -1.7647000 | 0.3524652 | 28.65756 | 2.60e-06 | 0.0058162 | 1500015O10Rik | RIKEN cDNA 1500015O10 gene | 1.7647000 |
| ENSMUSG00000032754 | 292.25578 | -0.2370884 | 0.1961784 | 26.82702 | 6.40e-06 | 0.0128041 | Slc8b1 | solute carrier family 8 (sodium/lithium/calcium exchanger), member B1 | 0.2370884 |
| ENSMUSG00000032487 | 56.09453 | -1.2328051 | 0.3928699 | 26.48593 | 7.50e-06 | 0.0138369 | Ptgs2 | prostaglandin-endoperoxide synthase 2 | 1.2328051 |
| ENSMUSG00000067344 | 25.85143 | 4.4037536 | 1.1508516 | 26.11994 | 9.00e-06 | 0.0152382 | NA | NA | 4.4037536 |
| ENSMUSG00000014813 | 487.60607 | -1.9735020 | 0.4110123 | 25.30678 | 1.33e-05 | 0.0195423 | Stc1 | stanniocalcin 1 | 1.9735020 |
| ENSMUSG00000082361 | 414.91219 | -1.4051898 | 0.2835380 | 25.32573 | 1.32e-05 | 0.0195423 | Btc | betacellulin, epidermal growth factor family member | 1.4051898 |
| ENSMUSG00000062683 | 112.20944 | 10.2865518 | 2.1356198 | 25.11978 | 1.46e-05 | 0.0200473 | Atp5g2 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit C2 (subunit 9) | 10.2865518 |
| ENSMUSG00000025155 | 696.96438 | 0.2600794 | 0.1055875 | 24.35990 | 2.10e-05 | 0.0243356 | Dus1l | dihydrouridine synthase 1-like (S. cerevisiae) | 0.2600794 |
| ENSMUSG00000082820 | 17.79334 | -1.4058141 | 0.8192381 | 24.44528 | 2.02e-05 | 0.0243356 | NA | NA | 1.4058141 |
| ENSMUSG00000096054 | 122.25463 | -1.0371696 | 0.2376876 | 24.56297 | 1.91e-05 | 0.0243356 | Syne1 | spectrin repeat containing, nuclear envelope 1 | 1.0371696 |
| ENSMUSG00000073643 | 1237.69324 | 0.8044436 | 0.4554556 | 23.14621 | 3.76e-05 | 0.0414239 | Wdfy1 | WD repeat and FYVE domain containing 1 | 0.8044436 |
| ID | Description | GeneRatio | BgRatio | pvalue | p.adjust | qvalue | |
|---|---|---|---|---|---|---|---|
| GO:0006814 | GO:0006814 | sodium ion transport | 5/29 | 198/20996 | 7.00e-06 | 0.0038892 | 0.0030372 |
| GO:0035725 | GO:0035725 | sodium ion transmembrane transport | 4/29 | 99/20996 | 1.01e-05 | 0.0038892 | 0.0030372 |
This will detect genes that change differently over time in the two conditions.
We used diana function inside cluster R package to separate genes using the expression correlation with time. Clusters with more than 3 genes are shown. Significant genes were those with log2FC bigger than 0.1 and FDR < 5%. The file with the information of this analysis is clusters_genes.tsv.
Working with 33 genes
Working with 23 genes after filtering
Working with 33 genes
Working with 20 genes after filtering
Working with 33 genes
Working with 25 genes after filtering
(useful if replicating these results)
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.1 (Sierra)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 methods stats graphics grDevices utils
[8] datasets base
other attached packages:
[1] RUVSeq_1.6.2 EDASeq_2.6.2
[3] ShortRead_1.30.0 GenomicAlignments_1.8.4
[5] Rsamtools_1.24.0 Biostrings_2.40.2
[7] XVector_0.12.1 BiocParallel_1.6.6
[9] vsn_3.40.0 DEGreport_1.11.2
[11] quantreg_5.29 SparseM_1.72
[13] dplyr_0.5.0 cluster_2.0.5
[15] org.Mm.eg.db_3.3.0 AnnotationDbi_1.34.4
[17] clusterProfiler_3.0.5 DOSE_2.10.7
[19] gridExtra_2.2.1 logging_0.7-103
[21] tximport_1.0.3 DESeq2_1.12.4
[23] SummarizedExperiment_1.2.3 Biobase_2.32.0
[25] GenomicRanges_1.24.3 GenomeInfoDb_1.8.7
[27] IRanges_2.6.1 S4Vectors_0.10.3
[29] BiocGenerics_0.18.0 pheatmap_1.0.8
[31] CHBUtils_0.1 edgeR_3.14.0
[33] limma_3.28.21 gplots_3.0.1
[35] reshape_0.8.6 ggplot2_2.2.0
[37] myRfunctions_0.1 knitr_1.15.1
[39] rmarkdown_1.1
loaded via a namespace (and not attached):
[1] colorspace_1.2-7 hwriter_1.3.2 qvalue_2.4.2
[4] MatrixModels_0.4-1 topGO_2.24.0 affyio_1.42.0
[7] codetools_0.2-15 splines_3.3.1 R.methodsS3_1.7.1
[10] GOSemSim_1.30.3 DESeq_1.24.0 geneplotter_1.50.0
[13] Formula_1.2-1 Nozzle.R1_1.1-1 annotate_1.50.1
[16] GO.db_3.3.0 R.oo_1.20.0 graph_1.50.0
[19] readr_1.0.0 assertthat_0.1 Matrix_1.2-7.1
[22] lazyeval_0.2.0 acepack_1.3-3.3 htmltools_0.3.5
[25] tools_3.3.1 igraph_1.0.1 coda_0.19-1
[28] gtable_0.2.0 affy_1.50.0 reshape2_1.4.1
[31] DO.db_2.9 Rcpp_0.12.7 gdata_2.17.0
[34] preprocessCore_1.34.0 rtracklayer_1.32.2 stringr_1.1.0
[37] gtools_3.5.0 XML_3.98-1.4 MASS_7.3-45
[40] zlibbioc_1.18.0 scales_0.4.1 aroma.light_3.2.0
[43] BiocInstaller_1.22.3 RColorBrewer_1.1-2 yaml_2.1.14
[46] biomaRt_2.28.0 rpart_4.1-10 latticeExtra_0.6-28
[49] stringi_1.1.2 RSQLite_1.0.0 highr_0.6
[52] genefilter_1.54.2 GenomicFeatures_1.24.5 caTools_1.17.1
[55] chron_2.3-47 matrixStats_0.51.0 bitops_1.0-6
[58] evaluate_0.10 lattice_0.20-34 labeling_0.3
[61] GSEABase_1.34.1 plyr_1.8.4 magrittr_1.5
[64] R6_2.2.0 Hmisc_3.17-4 DBI_0.5-1
[67] foreign_0.8-67 survival_2.39-5 RCurl_1.95-4.8
[70] nnet_7.3-12 tibble_1.2 KernSmooth_2.23-15
[73] locfit_1.5-9.1 grid_3.3.1 data.table_1.9.6
[76] digest_0.6.10 xtable_1.8-2 tidyr_0.6.0
[79] R.utils_2.4.0 munsell_0.4.3